Overview

Dataset statistics

Number of variables: 39
Number of observations: 347469
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 103.4 MiB
Average record size in memory: 312.0 B

Variable types

BOOL: 22
NUM: 9
CAT: 8

Warnings

building_id has unique values (Unique)
geo_level_1_id has 5358 (1.5%) zeros (Zeros)
age has 34725 (10.0%) zeros (Zeros)
count_families has 27937 (8.0%) zeros (Zeros)
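
The percentages in these warnings are the zero counts divided by the number of observations given in the dataset statistics (347469). A minimal Python sketch of that arithmetic follows; `share_percent` is a hypothetical helper name used only for illustration and is not part of pandas-profiling.

```python
# Hypothetical helper (not a pandas-profiling function):
# share of a count in a total, expressed in percent.
def share_percent(count: int, total: int, digits: int = 1) -> float:
    """Return 100 * count / total rounded to `digits` decimal places."""
    if total <= 0:
        raise ValueError("total must be positive")
    return round(100.0 * count / total, digits)


N_OBSERVATIONS = 347469  # "Number of observations" from the dataset statistics

# Zero counts exactly as listed in the warnings.
zero_counts = {"geo_level_1_id": 5358, "age": 34725, "count_families": 27937}

for name, zeros in zero_counts.items():
    print(f"{name}: {share_percent(zeros, N_OBSERVATIONS)}% zeros")
```

The same division reproduces the "Zeros (%)" entries in the per-variable sections.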

Reproduction

Analysis started: 2020-09-29 02:10:49.731631
Analysis finished: 2020-09-29 02:12:06.114848
Duration: 1 minute and 16.38 seconds
Software version: pandas-profiling v2.9.0
Download configuration: config.yaml

Variables

building_id
Real number (ℝ≥0)

UNIQUE

Distinct: 347469
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 525913.5838
Minimum: 4
Maximum: 1052934
Zeros: 0
Zeros (%): 0.0%
Memory size: 2.7 MiB

Quantile statistics

Minimum: 4
5-th percentile: 52200.8
Q1: 261999
Median: 526071
Q3: 789588
95-th percentile: 1000694
Maximum: 1052934
Range: 1052930
Interquartile range (IQR): 527589

Descriptive statistics

Standard deviation: 304354.4791
Coefficient of variation (CV): 0.5787157595
Kurtosis: -1.201737909
Mean: 525913.5838
Median Absolute Deviation (MAD): 263777
Skewness: 0.001061379559
Sum: 1.827386671e+11
Variance: 9.263164894e+10
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=50)]

Common values

Value                   Count     Frequency (%)
1052670                 1         < 0.1%
784426                  1         < 0.1%
682008                  1         < 0.1%
161818                  1         < 0.1%
684059                  1         < 0.1%
153630                  1         < 0.1%
151583                  1         < 0.1%
415271                  1         < 0.1%
768034                  1         < 0.1%
259653                  1         < 0.1%
Other values (347459)   347459    > 99.9%

Minimum 5 values

Value                   Count     Frequency (%)
4                       1         < 0.1%
7                       1         < 0.1%
8                       1         < 0.1%
12                      1         < 0.1%
13                      1         < 0.1%

Maximum 5 values

Value                   Count     Frequency (%)
1052934                 1         < 0.1%
1052931                 1         < 0.1%
1052929                 1         < 0.1%
1052926                 1         < 0.1%
1052923                 1         < 0.1%
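
Several entries in the quantile and descriptive statistics are derived from other reported values: the range is maximum minus minimum, the IQR is Q3 minus Q1, the coefficient of variation is the standard deviation divided by the mean, and the variance is the squared standard deviation. The sketch below recomputes them for building_id from the figures in this report. The function name `derived_stats` is an assumption made for illustration; it is not claimed to be how pandas-profiling computes these values internally.

```python
# Illustrative only: recompute derived statistics from reported summary values.
# `derived_stats` is a hypothetical name, not a pandas-profiling API.
def derived_stats(minimum, maximum, q1, q3, mean, std):
    """Return range, IQR, coefficient of variation and variance."""
    if mean == 0:
        raise ValueError("coefficient of variation is undefined for mean == 0")
    return {
        "range": maximum - minimum,   # Maximum - Minimum
        "iqr": q3 - q1,               # Q3 - Q1
        "cv": std / mean,             # standard deviation / mean
        "variance": std ** 2,         # squared standard deviation
    }


# Values copied from the building_id section of this report.
building_id = derived_stats(
    minimum=4,
    maximum=1052934,
    q1=261999,
    q3=789588,
    mean=525913.5838,
    std=304354.4791,
)

for key, value in building_id.items():
    print(f"{key}: {value}")
```

Because the inputs are already rounded in the report, the recomputed CV and variance agree with the reported ones only up to that rounding.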
 

geo_level_1_id
Real number (ℝ≥0)

ZEROS

Distinct: 31
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 13.89731458
Minimum: 0
Maximum: 30
Zeros: 5358
Zeros (%): 1.5%
Memory size: 2.7 MiB

Quantile statistics

Minimum: 0
5-th percentile: 3
Q1: 7
Median: 12
Q3: 21
95-th percentile: 27
Maximum: 30
Range: 30
Interquartile range (IQR): 14

Descriptive statistics

Standard deviation: 8.032596704
Coefficient of variation (CV): 0.5779963213
Kurtosis: -1.212221228
Mean: 13.89731458
Median Absolute Deviation (MAD): 6
Skewness: 0.2736617617
Sum: 4828886
Variance: 64.52260981
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=31)]

Common values

Value               Count     Frequency (%)
6                   32485     9.3%
26                  30002     8.6%
10                  29399     8.5%
17                  29265     8.4%
7                   25565     7.4%
8                   25465     7.3%
20                  22761     6.6%
21                  19944     5.7%
4                   19462     5.6%
27                  16786     4.8%
Other values (21)   96335     27.7%

Minimum 5 values

Value               Count     Frequency (%)
0                   5358      1.5%
1                   3588      1.0%
2                   1221      0.4%
3                   9995      2.9%
4                   19462     5.6%

Maximum 5 values

Value               Count     Frequency (%)
30                  3595      1.0%
29                  537       0.2%
28                  349       0.1%
27                  16786     4.8%
26                  30002     8.6%
 

geo_level_2_id
Real number (ℝ≥0)

Distinct: 1418
Distinct (%): 0.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 701.8380517
Minimum: 0
Maximum: 1427
Zeros: 53
Zeros (%): < 0.1%
Memory size: 2.7 MiB

Quantile statistics

Minimum: 0
5-th percentile: 69
Q1: 350
Median: 706
Q3: 1050
95-th percentile: 1377
Maximum: 1427
Range: 1427
Interquartile range (IQR): 700

Descriptive statistics

Standard deviation: 412.8756742
Coefficient of variation (CV): 0.5882776991
Kurtosis: -1.190841647
Mean: 701.8380517
Median Absolute Deviation (MAD): 349
Skewness: 0.02506845849
Sum: 243866966
Variance: 170466.3224
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=50)]

Common values

Value                 Count     Frequency (%)
39                    5367      1.5%
158                   3317      1.0%
181                   2805      0.8%
1387                  2675      0.8%
157                   2548      0.7%
363                   2343      0.7%
463                   2310      0.7%
673                   2249      0.6%
533                   2217      0.6%
883                   2129      0.6%
Other values (1408)   319509    92.0%

Minimum 5 values

Value                 Count     Frequency (%)
0                     53        < 0.1%
1                     252       0.1%
3                     90        < 0.1%
4                     415       0.1%
5                     31        < 0.1%

Maximum 5 values

Value                 Count     Frequency (%)
1427                  8         < 0.1%
1426                  378       0.1%
1425                  611       0.2%
1424                  8         < 0.1%
1423                  3         < 0.1%
 

geo_level_3_id
Real number (ℝ≥0)

Distinct: 11861
Distinct (%): 3.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 6258.84676
Minimum: 0
Maximum: 12567
Zeros: 4
Zeros (%): < 0.1%
Memory size: 2.7 MiB

Quantile statistics

Minimum: 0
5-th percentile: 614
Q1: 3073
Median: 6271
Q3: 9414
95-th percentile: 11928
Maximum: 12567
Range: 12567
Interquartile range (IQR): 6341

Descriptive statistics

Standard deviation: 3646.950564
Coefficient of variation (CV): 0.582687307
Kurtosis: -1.213916607
Mean: 6258.84676
Median Absolute Deviation (MAD): 3173
Skewness: 0.0005991846632
Sum: 2174755225
Variance: 13300248.41
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=50)]

Common values

Value                  Count     Frequency (%)
9133                   856       0.2%
633                    841       0.2%
621                    701       0.2%
11246                  633       0.2%
11440                  614       0.2%
2005                   613       0.2%
7723                   594       0.2%
9229                   516       0.1%
2452                   447       0.1%
10445                  402       0.1%
Other values (11851)   341252    98.2%

Minimum 5 values

Value                  Count     Frequency (%)
0                      4         < 0.1%
1                      6         < 0.1%
2                      2         < 0.1%
3                      13        < 0.1%
4                      1         < 0.1%

Maximum 5 values

Value                  Count     Frequency (%)
12567                  2         < 0.1%
12566                  2         < 0.1%
12565                  8         < 0.1%
12564                  7         < 0.1%
12563                  32        < 0.1%
 

count_floors_pre_eq
Real number (ℝ≥0)

Distinct: 9
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.130578555
Minimum: 1
Maximum: 9
Zeros: 0
Zeros (%): 0.0%
Memory size: 2.7 MiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 2
Median: 2
Q3: 2
95-th percentile: 3
Maximum: 9
Range: 8
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 0.72776061
Coefficient of variation (CV): 0.3415788675
Kurtosis: 2.36001261
Mean: 2.130578555
Median Absolute Deviation (MAD): 0
Skewness: 0.8418180575
Sum: 740310
Variance: 0.5296355054
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=9)]

Common values

Value   Count     Frequency (%)
2       209029    60.2%
3       74171     21.3%
1       53705     15.5%
4       7186      2.1%
5       3039      0.9%
6       283       0.1%
7       52        < 0.1%
8       3         < 0.1%
9       1         < 0.1%

Minimum 5 values

Value   Count     Frequency (%)
1       53705     15.5%
2       209029    60.2%
3       74171     21.3%
4       7186      2.1%
5       3039      0.9%

Maximum 5 values

Value   Count     Frequency (%)
9       1         < 0.1%
8       3         < 0.1%
7       52        < 0.1%
6       283       0.1%
5       3039      0.9%
 

age
Real number (ℝ≥0)

ZEROS

Distinct: 42
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 26.53881353
Minimum: 0
Maximum: 995
Zeros: 34725
Zeros (%): 10.0%
Memory size: 2.7 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 10
Median: 15
Q3: 30
95-th percentile: 60
Maximum: 995
Range: 995
Interquartile range (IQR): 20

Descriptive statistics

Standard deviation: 73.52774868
Coefficient of variation (CV): 2.770574072
Kurtosis: 157.3751623
Mean: 26.53881353
Median Absolute Deviation (MAD): 10
Skewness: 12.19598992
Sum: 9221415
Variance: 5406.329825
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=42)]

Common values

Value               Count     Frequency (%)
10                  51680     14.9%
15                  48074     13.8%
5                   45045     13.0%
20                  42792     12.3%
0                   34725     10.0%
25                  32586     9.4%
30                  23977     6.9%
35                  14420     4.2%
40                  14050     4.0%
50                  9619      2.8%
Other values (32)   30501     8.8%

Minimum 5 values

Value               Count     Frequency (%)
0                   34725     10.0%
5                   45045     13.0%
10                  51680     14.9%
15                  48074     13.8%
20                  42792     12.3%

Maximum 5 values

Value               Count     Frequency (%)
995                 1851      0.5%
200                 140       < 0.1%
195                 2         < 0.1%
190                 5         < 0.1%
185                 1         < 0.1%
 

area_percentage
Real number (ℝ≥0)

Distinct: 86
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 8.017014467
Minimum: 1
Maximum: 100
Zeros: 0
Zeros (%): 0.0%
Memory size: 2.7 MiB

Quantile statistics

Minimum: 1
5-th percentile: 3
Q1: 5
Median: 7
Q3: 9
95-th percentile: 16
Maximum: 100
Range: 99
Interquartile range (IQR): 4

Descriptive statistics

Standard deviation: 4.388646483
Coefficient of variation (CV): 0.5474165602
Kurtosis: 30.64344074
Mean: 8.017014467
Median Absolute Deviation (MAD): 2
Skewness: 3.53162645
Sum: 2785664
Variance: 19.26021795
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=50)]

Common values

Value               Count     Frequency (%)
6                   55959     16.1%
7                   49140     14.1%
5                   43556     12.5%
8                   37988     10.9%
9                   29572     8.5%
4                   25675     7.4%
10                  21030     6.1%
11                  18390     5.3%
3                   15687     4.5%
12                  10148     2.9%
Other values (76)   40324     11.6%

Minimum 5 values

Value               Count     Frequency (%)
1                   125       < 0.1%
2                   4275      1.2%
3                   15687     4.5%
4                   25675     7.4%
5                   43556     12.5%

Maximum 5 values

Value               Count     Frequency (%)
100                 1         < 0.1%
96                  3         < 0.1%
92                  3         < 0.1%
90                  1         < 0.1%
86                  7         < 0.1%
 

height_percentage
Real number (ℝ≥0)

Distinct: 29
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 5.4347985
Minimum: 2
Maximum: 32
Zeros: 0
Zeros (%): 0.0%
Memory size: 2.7 MiB

Quantile statistics

Minimum: 2
5-th percentile: 3
Q1: 4
Median: 5
Q3: 6
95-th percentile: 9
Maximum: 32
Range: 30
Interquartile range (IQR): 2

Descriptive statistics

Standard deviation: 1.915555029
Coefficient of variation (CV): 0.3524610948
Kurtosis: 13.53489828
Mean: 5.4347985
Median Absolute Deviation (MAD): 1
Skewness: 1.762884329
Sum: 1888424
Variance: 3.669351069
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=29)]

Common values

Value               Count     Frequency (%)
5                   104869    30.2%
6                   61837     17.8%
4                   50427     14.5%
7                   47360     13.6%
3                   34535     9.9%
8                   18460     5.3%
2                   12348     3.6%
9                   7146      2.1%
10                  5934      1.7%
12                  1246      0.4%
Other values (19)   3307      1.0%

Minimum 5 values

Value               Count     Frequency (%)
2                   12348     3.6%
3                   34535     9.9%
4                   50427     14.5%
5                   104869    30.2%
6                   61837     17.8%

Maximum 5 values

Value               Count     Frequency (%)
32                  90        < 0.1%
31                  2         < 0.1%
29                  1         < 0.1%
28                  2         < 0.1%
26                  3         < 0.1%
 
land_surface_condition
Categorical

Distinct: 3
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.7 MiB

Value   Count     Frequency (%)
t       288937    83.2%
n       47413     13.6%
o       11119     3.2%

[Figure: Frequencies of value counts]

Unique

Unique: 0
Unique (%): 0.0%

[Figure: Histogram of lengths of the category]

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

foundation_type
Categorical

Distinct: 5
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.7 MiB

Value   Count     Frequency (%)
r       292374    84.1%
w       20048     5.8%
u       18908     5.4%
i       14182     4.1%
h       1957      0.6%

[Figure: Frequencies of value counts]

Unique

Unique: 0
Unique (%): 0.0%

[Figure: Histogram of lengths of the category]

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

roof_type
Categorical

Distinct: 3
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.7 MiB

Value   Count     Frequency (%)
n       243975    70.2%
q       81905     23.6%
x       21589     6.2%

[Figure: Frequencies of value counts]

Unique

Unique: 0
Unique (%): 0.0%

[Figure: Histogram of lengths of the category]

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

ground_floor_type
Categorical

Distinct: 5
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.7 MiB

Value   Count     Frequency (%)
f       279591    80.5%
x       33109     9.5%
v       32731     9.4%
z       1334      0.4%
m       704       0.2%

[Figure: Frequencies of value counts]

Unique

Unique: 0
Unique (%): 0.0%

[Figure: Histogram of lengths of the category]

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

other_floor_type
Categorical

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.7 MiB

Value   Count     Frequency (%)
q       220286    63.4%
x       58139     16.7%
j       52912     15.2%
s       16132     4.6%

[Figure: Frequencies of value counts]

Unique

Unique: 0
Unique (%): 0.0%

[Figure: Histogram of lengths of the category]

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

position
Categorical

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.7 MiB

Value   Count     Frequency (%)
s       269463    77.6%
t       57258     16.5%
j       17647     5.1%
o       3101      0.9%

[Figure: Frequencies of value counts]

Unique

Unique: 0
Unique (%): 0.0%

[Figure: Histogram of lengths of the category]

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

plan_configuration
Categorical

Distinct: 10
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.7 MiB

Value   Count     Frequency (%)
d       333327    95.9%
q       7641      2.2%
u       4909      1.4%
c       450       0.1%
s       449       0.1%
a       353       0.1%
o       195       0.1%
m       64        < 0.1%
n       54        < 0.1%
f       27        < 0.1%

[Figure: Frequencies of value counts]

Unique

Unique: 0
Unique (%): 0.0%

[Figure: Histogram of lengths of the category]

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

has_superstructure_adobe_mud
Boolean

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.7 MiB

Value   Count     Frequency (%)
0       316554    91.1%
1       30915     8.9%

has_superstructure_mud_mortar_stone
Boolean

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.7 MiB

Value   Count     Frequency (%)
1       264798    76.2%
0       82671     23.8%

has_superstructure_stone_flag
Boolean

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.7 MiB

Value   Count     Frequency (%)
0       335528    96.6%
1       11941     3.4%

has_superstructure_cement_mortar_stone
Boolean

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.7 MiB

Value   Count     Frequency (%)
0       341104    98.2%
1       6365      1.8%

has_superstructure_mud_mortar_brick
Boolean

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.7 MiB

Value   Count     Frequency (%)
0       323848    93.2%
1       23621     6.8%

has_superstructure_cement_mortar_brick
Boolean

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.7 MiB

Value   Count     Frequency (%)
0       321440    92.5%
1       26029     7.5%

has_superstructure_timber
Boolean

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.7 MiB

Value   Count     Frequency (%)
0       258995    74.5%
1       88474     25.5%

has_superstructure_bamboo
Boolean

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.7 MiB

Value   Count     Frequency (%)
0       318046    91.5%
1       29423     8.5%

has_superstructure_rc_non_engineered
Boolean

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.7 MiB

Value   Count     Frequency (%)
0       332678    95.7%
1       14791     4.3%

has_superstructure_rc_engineered
Boolean

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.7 MiB

Value   Count     Frequency (%)
0       341964    98.4%
1       5505      1.6%

has_superstructure_other
Boolean

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.7 MiB

Value   Count     Frequency (%)
0       342243    98.5%
1       5226      1.5%

legal_ownership_status
Categorical

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.7 MiB

Value   Count     Frequency (%)
v       334633    96.3%
a       7307      2.1%
w       3539      1.0%
r       1990      0.6%

[Figure: Frequencies of value counts]

Unique

Unique: 0
Unique (%): 0.0%

[Figure: Histogram of lengths of the category]

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

count_families
Real number (ℝ≥0)

ZEROS

Distinct: 10
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.9837395566
Minimum: 0
Maximum: 9
Zeros: 27937
Zeros (%): 8.0%
Memory size: 2.7 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 1
Median: 1
Q3: 1
95-th percentile: 2
Maximum: 9
Range: 9
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 0.4193854935
Coefficient of variation (CV): 0.4263176069
Kurtosis: 17.24872251
Mean: 0.9837395566
Median Absolute Deviation (MAD): 0
Skewness: 1.627559333
Sum: 341819
Variance: 0.1758841922
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=10)]

Common values

Value   Count     Frequency (%)
1       301377    86.7%
0       27937     8.0%
2       15010     4.3%
3       2415      0.7%
4       547       0.2%
5       135       < 0.1%
6       33        < 0.1%
7       8         < 0.1%
9       4         < 0.1%
8       3         < 0.1%

Minimum 5 values

Value   Count     Frequency (%)
0       27937     8.0%
1       301377    86.7%
2       15010     4.3%
3       2415      0.7%
4       547       0.2%

Maximum 5 values

Value   Count     Frequency (%)
9       4         < 0.1%
8       3         < 0.1%
7       8         < 0.1%
6       33        < 0.1%
5       135       < 0.1%
 
has_secondary_use
Boolean

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.7 MiB

Value   Count     Frequency (%)
0       308630    88.8%
1       38839     11.2%

has_secondary_use_agriculture
Boolean

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.7 MiB

Value   Count     Frequency (%)
0       325124    93.6%
1       22345     6.4%

has_secondary_use_hotel
Boolean

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.7 MiB

Value   Count     Frequency (%)
0       335764    96.6%
1       11705     3.4%

has_secondary_use_rental
Boolean

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.7 MiB

Value   Count     Frequency (%)
0       344642    99.2%
1       2827      0.8%

has_secondary_use_institution
Boolean

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.7 MiB

Value   Count     Frequency (%)
0       347136    99.9%
1       333       0.1%

has_secondary_use_school
Boolean

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.7 MiB

Value   Count     Frequency (%)
0       347343    > 99.9%
1       126       < 0.1%

has_secondary_use_industry
Boolean

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.7 MiB

Value   Count     Frequency (%)
0       347103    99.9%
1       366       0.1%

has_secondary_use_health_post
Boolean

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.7 MiB

Value   Count     Frequency (%)
0       347411    > 99.9%
1       58        < 0.1%

has_secondary_use_gov_office
Boolean

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.7 MiB

Value   Count     Frequency (%)
0       347421    > 99.9%
1       48        < 0.1%

has_secondary_use_use_police
Boolean

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.7 MiB

Value   Count     Frequency (%)
0       347442    > 99.9%
1       27        < 0.1%

has_secondary_use_other
Boolean

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.7 MiB

Value   Count     Frequency (%)
0       345709    99.5%
1       1760      0.5%

Interactions

[Figures: 81 interaction plots]

Correlations


Pearson's r

Pearson's correlation coefficient (r) measures the linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, which implies that for a linear function the angle to the x-axis does not affect r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
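
The calculation described above can be sketched in plain Python as follows. This is a minimal illustration under the assumption of two equal-length numeric sequences with nonzero spread; `pearson_r` is a name chosen here and is not claimed to be the routine the report generator uses internally.

```python
import math


def pearson_r(xs, ys):
    """Covariance of xs and ys divided by the product of their standard deviations."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sample covariance and sample standard deviations (the n - 1 factors cancel in r).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)
    std_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs) / (n - 1))
    std_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys) / (n - 1))
    if std_x == 0 or std_y == 0:
        raise ValueError("r is undefined when a variable is constant")
    return cov / (std_x * std_y)


xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]
print(pearson_r(xs, ys))
# Shifting and rescaling one variable (positive scale) leaves r unchanged.
print(pearson_r([10 * x + 3 for x in xs], ys))
```

The second call illustrates the invariance under changes in location and scale mentioned above.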

Spearman's ρ

Spearman's rank correlation coefficient (ρ) measures the monotonic correlation between two variables and is therefore better than Pearson's r at catching nonlinear monotonic correlations. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
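
As a sketch of that recipe, the Python below converts each variable to ranks and then applies the covariance-over-standard-deviations formula to the ranks. Giving tied values the average of their rank positions is an assumption made here (it is a common convention); the helper names `average_ranks` and `spearman_rho` are illustrative only.

```python
import math


def average_ranks(values):
    """Rank values from 1..n; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # average of 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman_rho(xs, ys):
    """Pearson-style correlation computed on the rank variables of xs and ys."""
    rx, ry = average_ranks(xs), average_ranks(ys)
    n = len(rx)
    if n != len(ry) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = math.sqrt(sum((a - mx) ** 2 for a in rx))
    sy = math.sqrt(sum((b - my) ** 2 for b in ry))
    if sx == 0 or sy == 0:
        raise ValueError("rho is undefined when a variable is constant")
    return cov / (sx * sy)


# A nonlinear but strictly increasing relationship.
xs = [1, 2, 3, 4, 5]
ys = [x ** 3 for x in xs]
print(spearman_rho(xs, ys))
```

Because only the ordering matters, any strictly increasing transformation of a variable leaves ρ unchanged.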

Kendall's τ

Like Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the discordant pairs divided by the total number of pairs.
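
The pair-counting definition above can be written directly in Python. The sketch implements exactly that formula, often called tau-a, in which tied pairs count as neither concordant nor discordant. Statistical libraries frequently report a tie-adjusted variant instead (for example, SciPy's `kendalltau` defaults to tau-b), so results can differ when the data contain ties. The function name `kendall_tau_a` is chosen here for illustration.

```python
from itertools import combinations


def kendall_tau_a(xs, ys):
    """(concordant pairs - discordant pairs) / total number of pairs."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    concordant = 0
    discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(xs, ys), 2):
        direction = (x1 - x2) * (y1 - y2)
        if direction > 0:      # both variables move in the same direction
            concordant += 1
        elif direction < 0:    # the variables move in opposite directions
            discordant += 1
        # direction == 0 means a tie in x or y; such pairs count for neither side
    total_pairs = n * (n - 1) // 2
    return (concordant - discordant) / total_pairs


print(kendall_tau_a([1, 2, 3, 4], [1, 3, 2, 4]))
```

The double loop over pairs is quadratic in the number of observations, which is acceptable for a small illustration but slow for a dataset of this size.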

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency and reverts to the Pearson correlation coefficient in case of a bivariate normal input distribution. There is extensive documentation available here.

Cramér's V (φc)

Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. We use the bias-corrected measure proposed by Bergsma in 2013.
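
A minimal Python sketch of a bias-corrected Cramér's V for two categorical sequences follows. It uses the commonly cited form of Bergsma's correction: the chi-squared statistic divided by n, reduced by (k-1)(r-1)/(n-1) and floored at zero, with the numbers of rows and columns shrunk in the same spirit. The function name `cramers_v_corrected` is an assumption for illustration, and the code is not claimed to match the report generator's implementation line for line.

```python
import math
from collections import Counter


def cramers_v_corrected(xs, ys):
    """Bias-corrected Cramér's V for two equal-length categorical sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    joint = Counter(zip(xs, ys))          # observed cell counts
    row_totals = Counter(xs)
    col_totals = Counter(ys)
    r, k = len(row_totals), len(col_totals)

    # Pearson chi-squared statistic over the full r x k contingency table.
    chi2 = 0.0
    for a, row_count in row_totals.items():
        for b, col_count in col_totals.items():
            expected = row_count * col_count / n
            observed = joint.get((a, b), 0)
            chi2 += (observed - expected) ** 2 / expected

    phi2 = chi2 / n
    phi2_corrected = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corrected = r - (r - 1) ** 2 / (n - 1)
    k_corrected = k - (k - 1) ** 2 / (n - 1)
    denominator = min(k_corrected - 1, r_corrected - 1)
    if denominator <= 0:
        return 0.0  # a variable with a single category carries no association
    return math.sqrt(phi2_corrected / denominator)


# Two perfectly associated nominal variables.
xs = ["a"] * 50 + ["b"] * 50
ys = ["x"] * 50 + ["y"] * 50
print(cramers_v_corrected(xs, ys))
```

Flooring the corrected statistic at zero keeps the coefficient within the 0 to 1 range stated above.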

Missing values

[Figures: 2 missing-values plots]

Sample

First rows

building_id  geo_level_1_id  geo_level_2_id  geo_level_3_id  count_floors_pre_eq  age  area_percentage  height_percentage  land_surface_condition  foundation_type  roof_type  ground_floor_type  other_floor_type  position  plan_configuration  has_superstructure_adobe_mud  has_superstructure_mud_mortar_stone  has_superstructure_stone_flag  has_superstructure_cement_mortar_stone  has_superstructure_mud_mortar_brick  has_superstructure_cement_mortar_brick  has_superstructure_timber  has_superstructure_bamboo  has_superstructure_rc_non_engineered  has_superstructure_rc_engineered  has_superstructure_other  legal_ownership_status  count_families  has_secondary_use  has_secondary_use_agriculture  has_secondary_use_hotel  has_secondary_use_rental  has_secondary_use_institution  has_secondary_use_school  has_secondary_use_industry  has_secondary_use_health_post  has_secondary_use_gov_office  has_secondary_use_use_police  has_secondary_use_other
080290664871219823065trnfqtd11000000000v100000000000
1288308900281221087ornxqsd01000000000v100000000000
29494721363897321055trnfxtd01000000000v100000000000
3590882224181069421065trnfxsd01000011000v100000000000
420194411131148833089trnfxsd10000000000v100000000000
53330208558608921095trnfqsd01000000000v111000000000
672845194751206622534nrnxqsd01000000000v100000000000
747551520323122362086twqvxsu00000110000v100000000000
84411260757721921586trqfqsd01000010000v100000000000
99895002688699410134tinvjsd00000100000v100000000000

Last rows

building_id  geo_level_1_id  geo_level_2_id  geo_level_3_id  count_floors_pre_eq  age  area_percentage  height_percentage  land_surface_condition  foundation_type  roof_type  ground_floor_type  other_floor_type  position  plan_configuration  has_superstructure_adobe_mud  has_superstructure_mud_mortar_stone  has_superstructure_stone_flag  has_superstructure_cement_mortar_stone  has_superstructure_mud_mortar_brick  has_superstructure_cement_mortar_brick  has_superstructure_timber  has_superstructure_bamboo  has_superstructure_rc_non_engineered  has_superstructure_rc_engineered  has_superstructure_other  legal_ownership_status  count_families  has_secondary_use  has_secondary_use_agriculture  has_secondary_use_hotel  has_secondary_use_rental  has_secondary_use_institution  has_secondary_use_school  has_secondary_use_industry  has_secondary_use_health_post  has_secondary_use_gov_office  has_secondary_use_use_police  has_secondary_use_other
34745929084217114958073555trnfqsd01000011000v100000000000
347460330371455199622595trqfqsd01000000000v100000000000
34746169861220173518321095twqfxsu00000010000v100000000000
3474624451926460925825146trxxssd00000000100v100000000000
3474636401157116640625165tixvssd00000000010v210100000000
34746431002846053623370206trqfqtd01000010000w111000000000
3474656635671014071190732567nrnfqsd11100000000v100000000000
3474661049160221136771215033trnfjsd01000010000v100000000000
347467442785610419122595trnfqsd11000000000a100000000000
34746850137226366436210114trqvqsd00000100000v100000000000